SQL Practice Hub

1. Relational Model & Normalization

Data as relations (tables) with keys ensuring integrity.

Keys

Primary key uniquely identifies row. Foreign key enforces referential link.

Integrity

Normalization

1NF atomic, 2NF full dependency, 3NF remove transitive, BCNF stronger determinants.

Denormalization

Performance trade: duplicates for fewer joins; maintain with routines or views.

Constraints

NOT NULL, UNIQUE, CHECK, DEFAULT, FOREIGN KEY; enforce invariants at storage.

Sample Schema

CREATE TABLE users(
  id SERIAL PRIMARY KEY,
  email TEXT UNIQUE NOT NULL,
  created_at TIMESTAMPTZ DEFAULT now()
);
CREATE TABLE orders(
  id SERIAL PRIMARY KEY,
  user_id INT REFERENCES users(id),
  total_cents INT CHECK(total_cents >= 0)
);

2. Core Querying & Joins

Think in sets; express desired result not loops.

JOIN Types

INNER: intersection
LEFT: preserve left
RIGHT: preserve right
FULL: union w/ nulls
CROSS: cartesian

Filtering & Grouping

WHERE filters rows before grouping
HAVING filters groups after aggregate
ORDER BY after SELECT projection

Window Functions

ROW_NUMBER(), RANK()
PARTITION BY groups logical window
Frame clauses refine subset

Set Ops

UNION (distinct)
UNION ALL (no dedupe)
INTERSECT, EXCEPT

JOIN Example

SELECT u.email, o.total_cents
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE o.total_cents > 5000
ORDER BY o.total_cents DESC;

3. Aggregation, Windows & Transactions

Summarize data then layer analytic context & consistency.

Aggregation

COUNT(*), SUM(col), AVG()
GROUP BY keys define grain
Distinct counts expensive; sample+estimate

Windows

SUM(val) OVER (PARTITION BY k ORDER BY t ROWS 3 PRECEDING)
Running totals & moving averages

Transactions

BEGIN / COMMIT / ROLLBACK
Isolation: Read Uncommitted → Serializable
ACID: atomicity, consistency, isolation, durability

Locking

Row vs table locks
MVCC reduces read contention
Deadlock detection & retry

Window Example

SELECT user_id,
       total_cents,
       SUM(total_cents) OVER (PARTITION BY user_id ORDER BY id ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS rolling_3
FROM orders;

4. Indexing & Execution Plans

Accelerate lookups, maintain write cost awareness.

Index Types

B-Tree (range, equality)
Hash (equality only)
GIN / GiST (full-text, arrays)

Selectivity

High selectivity = few rows = better candidate
Composite index order matters (left prefix)

Execution Plans

EXPLAIN / EXPLAIN ANALYZE
Seq Scan vs Index Scan
Filter vs Index Condition

Anti-Patterns

Over-index (write penalty)
Functions on indexed column (unless expr index)
Leading wildcard LIKE (no use of index)

Index Example

CREATE INDEX idx_orders_user_total ON orders(user_id,total_cents);
EXPLAIN ANALYZE SELECT * FROM orders WHERE user_id=5 AND total_cents>1000;

6. Query Patterns Cheat Sheet

Reusable structural templates.

Top-N Per Group

SELECT * FROM (
  SELECT *, ROW_NUMBER() OVER(
    PARTITION BY user_id ORDER BY total_cents DESC) rn
  FROM orders
) t WHERE rn <= 3;

Upsert (Postgres)

INSERT INTO users(id,email)
VALUES(1,'a@x')
ON CONFLICT (id)
DO UPDATE SET email=EXCLUDED.email;

Pivot via FILTER

SELECT
  user_id,
  SUM(total_cents) FILTER (WHERE total_cents > 10000) AS high,
  SUM(total_cents) FILTER (WHERE total_cents <= 10000) AS low
FROM orders GROUP BY 1;

Running Total

SELECT id,total_cents,
 SUM(total_cents) OVER (ORDER BY id) run
FROM orders;

Gap Detection

SELECT id, prev_id, id-prev_id-1 AS gap
FROM (
  SELECT id, LAG(id) OVER(ORDER BY id) prev_id
  FROM orders
) t WHERE id-prev_id > 1;

Percentile

SELECT percentile_cont(0.9)
WITHIN GROUP (ORDER BY total_cents)
FROM orders;

7. Review & Mastery

Checklist & conceptual Q&A.

Progress Checklist

Explain normalization up to 3NF Choose appropriate join for scenario Use window function for analytic metric Interpret an EXPLAIN plan difference Recommend index for selective predicate Distinguish ACID properties clearly Write safe upsert (conflict resolution) Identify when to denormalize

Concept Q&A

Why window over self-join for running totals?

Windows scan once with frame semantics; self-joins duplicate row counts & increase complexity.

Primary vs unique key?

PRIMARY KEY = NOT NULL + UNIQUE plus singular table identity; multiple UNIQUE constraints allowed.

High cardinality index issues?

Generally beneficial for selectivity; downside is write amplification & larger index memory footprint.

Why analyze execution plans?

Reveal access paths (seq vs index), join algorithms, row estimates; mismatches hint statistics problems.